3.2.4. Tips for parameter search
Specifying an objective metric
By default, parameter search uses the score function of the estimator to evaluate a parameter setting.
This is sklearn.metrics.accuracy_score for classifiers and sklearn.metrics.r2_score for regressors.
For some applications, other scoring functions are better suited (for example in unbalanced classification, the accuracy score is often uninformative).
An alternative scoring function can be specified via the scoring parameter of most parameter search tools.
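As a minimal sketch of the scoring parameter, the following searches over the regularization strength of a logistic regression on an imbalanced problem and ranks settings by balanced accuracy instead of plain accuracy. The synthetic dataset, the choice of estimator, and the grid values are assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic imbalanced data (about 90% / 10%), assumed for illustration.
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="balanced_accuracy",  # replaces the estimator's default score
    cv=5,
)
search.fit(X, y)

# best_score_ is the mean cross-validated balanced accuracy of the best setting.
print(search.best_params_, search.best_score_)
```

The string "balanced_accuracy" is one of the predefined scorer names; a callable scorer can be passed in the same place.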
Specifying multiple metrics for evaluation
GridSearchCV and RandomizedSearchCV allow specifying multiple metrics for the scoring parameter.
HalvingRandomSearchCV and HalvingGridSearchCV do not support multimetric scoring.
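A short sketch of multimetric scoring with GridSearchCV follows. The dataset, the decision tree, and the grid are assumptions for illustration. Note that with more than one metric, refit has to name the metric used to pick the final model (or be set to False).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data, assumed for illustration.
X, y = make_classification(n_samples=300, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 4, 8]},
    scoring=["accuracy", "f1"],  # evaluate every setting with both metrics
    refit="f1",                  # metric that selects best_params_
    cv=5,
)
search.fit(X, y)

results = search.cv_results_
# Each metric gets its own columns, suffixed with the scorer name.
print(results["mean_test_accuracy"])
print(results["mean_test_f1"])
print(search.best_params_)
```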
Composite estimators and parameter spaces
Model selection by evaluating various parameter settings can be seen as a way to use the labeled data to “train” the parameters of the grid.
When evaluating the resulting model it is important to do it on held-out samples that were not seen during the grid search process:
it is recommended to split the data into a development set (to be fed to the GridSearchCV instance) and an evaluation set to compute performance metrics.
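The split described above can be sketched as follows, using train_test_split to hold out an evaluation set before the search is run. The dataset, the SVC estimator, the grid, and the 75/25 split ratio are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic data, assumed for illustration.
X, y = make_classification(n_samples=400, random_state=0)

# Development set goes to the search; the evaluation set is never seen by it.
X_dev, X_eval, y_dev, y_eval = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

search = GridSearchCV(SVC(), param_grid={"C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_dev, y_dev)

# Performance of the refitted best model on held-out samples.
eval_score = search.score(X_eval, y_eval)
print(search.best_params_, eval_score)
```

The cross-validated best_score_ is computed on the development set and tends to be optimistic, which is why the held-out score is reported separately.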
Parallelism
Parallel execution of the search is controlled by the n_jobs argument.
Robustness to failure
Some parameter settings may result in a failure to fit one or more folds of the data.
In scikit-learn versions before 0.22, this caused the entire search to fail by default, even if some parameter settings could be fully evaluated; since 0.22 the default is error_score=np.nan, and error_score='raise' restores the failing behavior.
Setting error_score=0 (or error_score=np.nan) makes the procedure robust to such failures: a warning is issued and the score for that fold is set to 0 (or NaN), but the search completes.
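The following is a minimal sketch of error_score=0. It assumes a grid that deliberately contains max_depth=0, a value a decision tree rejects at fit time, so that every fold for that setting fails. The dataset and grid are illustrative assumptions.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, assumed for illustration.
X, y = make_classification(n_samples=200, random_state=0)

# max_depth=0 is invalid on purpose: every fit with it raises an error.
param_grid = {"max_depth": [0, 2]}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid=param_grid,
    cv=3,
    error_score=0,  # failed folds score 0 instead of aborting the search
)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the fit-failure warning in this demo
    search.fit(X, y)

# The failing setting is kept in the results with a score of 0.
print(search.cv_results_["mean_test_score"])
print(search.best_params_)
```

If every setting in the grid failed, the search would still raise, since no model could be selected.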